On the Distribution of K-tuple Matches for Sequence Homology: A Constant Time Exact Calculation of the Variance

Authors

  • Gary Benson
  • X. Su
Abstract

We study the distribution of a statistic useful in calculating the significance of the number of k-tuple matches detected in biological sequence homology algorithms. The statistic is $R_{n,k}$, the total number of heads in head runs of length k or more in a sequence of iid Bernoulli trials of length n. Calculation of the mean is straightforward. Poisson approximation formulas have been used for the variance because they are simple and powerful. Unfortunately, when p = P(Head) is large, the Poisson approximation no longer works well. In our application, p is large, say 0.75, and we have turned instead to direct calculation of the variance. Surprisingly, we are able to show that the variance, which is based on the interactions of $O(n^2)$ random variables, can be computed in constant time, independent of the length of the sequence and probability p. This result can be used to calculate the mean and variance of a number of other head run statistics in constant time. Additionally, we show how to extend the result to sequences generated by a stationary Markov process where the variance can be calculated in $O(n)$ time.
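
As an illustration of the statistic defined in the abstract, the following Python sketch computes $R_{n,k}$ for a given 0/1 sequence and its exact mean under iid Bernoulli trials. It is not the paper's constant-time variance formula, which the abstract does not state. The function names (`r_nk`, `mean_r_nk`, `moments_by_enumeration`) are hypothetical. The mean is obtained by a plain $O(n^2)$ sum over maximal runs, and the variance only by brute-force enumeration, which is feasible for small n only.

```python
from itertools import product


def r_nk(trials, k):
    """Total number of heads lying in maximal head runs of length >= k."""
    total = 0
    run = 0
    for t in trials:
        if t:
            run += 1
        else:
            if run >= k:
                total += run
            run = 0
    if run >= k:  # a run that reaches the end of the sequence
        total += run
    return total


def mean_r_nk(n, k, p):
    """Exact E[R_{n,k}] by linearity of expectation over maximal runs.

    A maximal run starting at position s (1-based) with length L needs
    L heads, a tail just before it unless s == 1, and a tail just after
    it unless it ends at position n. This is an illustrative O(n^2) sum,
    not the constant-time method of the paper.
    """
    q = 1.0 - p
    mean = 0.0
    for s in range(1, n + 1):
        for length in range(k, n - s + 2):
            prob = p ** length
            if s > 1:
                prob *= q
            if s + length - 1 < n:
                prob *= q
            mean += length * prob
    return mean


def moments_by_enumeration(n, k, p):
    """Exact mean and variance by enumerating all 2**n sequences (small n only)."""
    m1 = 0.0
    m2 = 0.0
    for trials in product((0, 1), repeat=n):
        heads = sum(trials)
        prob = p ** heads * (1.0 - p) ** (n - heads)
        r = r_nk(trials, k)
        m1 += prob * r
        m2 += prob * r * r
    return m1, m2 - m1 * m1
```

For small n, `moments_by_enumeration(n, k, p)` can serve as a reference value against which any faster formula for the mean or variance is checked.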


Similar resources

Exact sequences of extended $d$-homology

In this article, we show the existence of certain exact sequences with respect to two homology theories, called d-homology and extended d-homology. We present sufficient conditions for the existence of a long exact extended d-homology sequence. We also give some illustrative examples.


Roman k-Tuple Domination in Graphs

For any integer $k\geq 1$ and any graph $G=(V,E)$ with minimum degree at least $k-1$, we define a function $f:V\rightarrow \{0,1,2\}$ as a Roman $k$-tuple dominating function on $G$ if for any vertex $v$ with $f(v)=0$ there exist at least $k$ and for any vertex $v$ with $f(v)\neq 0$ at least $k-1$ vertices in its neighborhood with $f(w)=2$. The minimum weight of a Roman $k$-tuple dominatin...


$k$-tuple total restrained domination/domatic in graphs

For any integer $k\geq 1$, a set $S$ of vertices in a graph $G=(V,E)$ is a $k$-tuple total dominating set of $G$ if any vertex of $G$ is adjacent to at least $k$ vertices in $S$, and any vertex of $V-S$ is adjacent to at least $k$ vertices in $V-S$. The minimum number of vertices of such a set in $G$ we call the $k$-tuple total restrained domination number of $G$. The maximum num...


k-TUPLE DOMATIC IN GRAPHS

For every positive integer k, a set S of vertices in a graph G = (V;E) is a k-tuple dominating set of G if every vertex of V - S is adjacent to at least k vertices and every vertex of S is adjacent to at least k - 1 vertices in S. The minimum cardinality of a k-tuple dominating set of G is the k-tuple domination number of G. When k = 1, a k-tuple domination number is the well-studied domination...


Calculation of Natural Frequencies of Bi-Layered Rotating Functionally Graded Cylindrical Shells

In this paper, an exact analytical solution for free vibration of rotating bi-layered cylindrical shell composed of two independent functionally graded layers was presented. The thicknesses of the shell layers were assumed to be equal and constant. The material properties of the constituents of bi-layered FGM cylindrical shell were graded in the thickness direction of the layers of the shell ac...



Journal title:
  • Journal of computational biology : a journal of computational molecular cell biology

Volume 5, Issue 1

Pages: -

Publication date: 1998